Lifelog Scene Change Detection Using Cascades of Audio and Video Detectors

Authors

  • Katariina Mahkonen
  • Joni-Kristian Kämäräinen
  • Tuomas Virtanen
Abstract

The advent of affordable wearable devices with a video camera has established a new form of social data, lifelogs, in which people's lives are captured on video. The enormous amount of lifelog data and the need for on-site processing demand new, fast video processing methods. In this work, we experimentally investigate seven hours of lifelogs and point out novel findings: 1) audio cues are exceptionally strong for lifelog processing; 2) cascades of audio and video detectors improve accuracy and enable fast (super-frame-rate) processing. We first construct strong detectors using state-of-the-art audio and visual features: Mel-frequency cepstral coefficients (MFCC), colour (RGB) histograms, and local patch descriptors (SIFT). In the second stage, we construct a cascade of the trained detectors and optimise the cascade parameters. Separating the detector and cascade optimisation stages simplifies training and results in a fast and accurate processing pipeline.
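
The abstract does not spell out how the cascade makes its decisions, so the following is only a minimal sketch of one common early-exit design, not the authors' method. It assumes each detector returns a scene-change score in [0, 1], that a cheap audio stage runs first, and that a stage passes a segment on to the next, more expensive stage only when its score falls between two thresholds. The names `Stage`, `run_cascade`, and `STAGES`, the threshold values, and the precomputed `"audio"` and `"video"` scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass
class Stage:
    """One detector in the cascade (hypothetical structure)."""
    name: str
    score: Callable[[Dict[str, float]], float]  # scene-change score in [0, 1]
    reject_below: float   # scores under this are confidently "no change"
    accept_above: float   # scores over this are confidently "change"


def run_cascade(stages: Sequence[Stage],
                segment: Dict[str, float]) -> Tuple[bool, str]:
    """Return (is_scene_change, name_of_deciding_stage).

    Every stage except the last may exit early; the last stage always
    decides, using its accept_above value as a single threshold.
    """
    for stage in stages[:-1]:
        s = stage.score(segment)
        if s < stage.reject_below:
            return False, stage.name
        if s > stage.accept_above:
            return True, stage.name
        # Otherwise the stage is uncertain: fall through to the next one.
    last = stages[-1]
    return last.score(segment) >= last.accept_above, last.name


# Assumed ordering: cheap audio stage first, costlier video stage second.
# The scores are read from precomputed values purely for illustration.
STAGES = [
    Stage("audio", lambda seg: seg["audio"], reject_below=0.2, accept_above=0.8),
    Stage("video", lambda seg: seg["video"], reject_below=0.5, accept_above=0.5),
]

if __name__ == "__main__":
    print(run_cascade(STAGES, {"audio": 0.1, "video": 0.9}))
    print(run_cascade(STAGES, {"audio": 0.5, "video": 0.7}))
```

In this sketch the speed gain comes from the early exits: whenever the audio stage is confident, the video features are never computed for that segment. The thresholds would be the cascade parameters tuned after the individual detectors are trained.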

Similar resources

Compressed Domain Scene Change Detection Based on Transform Units Distribution in High Efficiency Video Coding Standard

Scene change detection plays an important role in a number of video applications, including video indexing, searching, browsing, semantic features extraction, and, in general, pre-processing and post-processing operations. Several scene change detection methods have been proposed in different coding standards. Most of them use fixed thresholds for the similarity metrics to determine if there wa...

Unsupervised Emotional Scene Detection from Lifelog Videos Using Cluster Ensembles

An emotional scene detection method is proposed in order to retrieve impressive scenes from lifelog videos. The proposed method is based on facial expression recognition, considering that a wide variety of facial expressions can be observed in impressive scenes. Conventional facial expression techniques, which focus on discriminating typical facial expressions, will be inadequate for lifelog vi...

RUCMM at MediaEval 2015 Affective Impact of Movies Task: Fusion of Audio and Visual Cues

This paper summarizes our efforts in our first participation in the Violent Scene Detection subtask of the MediaEval 2015 Affective Impact of Movies Task. We build violent scene detectors using both audio and visual cues. In particular, the audio cue is represented by bag-of-audio-words with Fisher vector encoding. The visual cue is exploited by extracting CNN features from video frames. ...

Fire detection using video sequences in urban out-door environment

Nowadays, automated early warning systems are essential in human life. One of these systems is fire detection, which plays an important role in surveillance and security systems because fire can spread quickly and cause great damage to an area. Traditional fire detection methods are usually based on smoke and temperature detectors (sensors). These methods cannot work properly in large space a...

Scene change detection by audio and video clues

Automatic video scene change detection is a challenging task. Using audio or visual information alone often cannot provide a satisfactory solution. However, how to combine audio and visual information efficiently still remains a difficult issue since there are various cases in their relationship due to the versatility of videos. In this paper, we present an effective scene change detection meth...

Publication date: 2014